
56 research outputs found

MixtureTree: a program for constructing phylogeny

Author: Bruce G Lindsay
DF Robinson
DL Swofford
F Ronquist
J Felsenstein
J Felsenstein
J Felsenstein
J Li
JP Huelsenbeck
Michael S Rosenberg
O Harismendy
RR Hudson
S Geman
S Holmes
S Kumar
SC Chen
Shu-Chuan Chen
T Margush
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract. Background: MixtureTree v1.0 is a Linux-based program, written in C++, that implements an algorithm based on mixture models for reconstructing phylogeny from binary sequence data, such as single-nucleotide polymorphisms (SNPs). In addition to the mixture algorithm with three different optimization options, the program also implements a bootstrap procedure with majority-rule consensus. Results: The MixtureTree program is a Linux-based package written in C++. The User's Guide and source code will be available at http://math.asu.edu/~scchen/MixtureTree.html. Conclusions: The efficiency of the mixture algorithm is relatively higher than that of some classical methods, such as the Neighbor-Joining, Maximum Parsimony, and Maximum Likelihood methods. Shortcomings of the mixture tree algorithm, such as its time consumption, can be reduced by implementing revised Expectation-Maximization (EM) algorithms in place of the traditional EM algorithm.
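The majority-rule consensus step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not taken from MixtureTree (which is written in C++); the function name `majority_rule_clades` and the representation of each bootstrap replicate as a set of clades are assumptions made only for illustration.

```python
from collections import Counter

def majority_rule_clades(replicate_trees):
    """Return {clade: support} for clades found in more than half of the trees.

    Hypothetical representation: each tree is a set of clades, and each
    clade is a frozenset of taxon names.
    """
    counts = Counter(clade for tree in replicate_trees for clade in set(tree))
    n_trees = len(replicate_trees)
    # Strict majority: a clade must appear in more than 50% of replicates.
    return {clade: n / n_trees for clade, n in counts.items() if n > n_trees / 2}

# Three toy bootstrap replicates on taxa A, B, C, D.
replicates = [
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("ABD")},
    {frozenset("AC"), frozenset("ABC")},
]
support = majority_rule_clades(replicates)
for clade, value in sorted(support.items(), key=lambda item: sorted(item[0])):
    print(sorted(clade), round(value, 2))
```

Clades that pass a strict majority threshold are always mutually compatible, which is why the retained clades can be assembled into a single consensus tree.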

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The statistical neuroanatomy of frontal networks in the macaque

We were interested in gaining insight into the functional properties of frontal networks based upon their anatomical inputs. We took a neuroinformatics approach, carrying out maximum likelihood hierarchical cluster analysis on 25 frontal cortical areas based upon their anatomical connections, with 68 input areas representing exterosensory, chemosensory, motor, limbic, and other frontal inputs. The analysis revealed a set of statistically robust clusters. We used these clusters to divide the frontal areas into 5 groups: ventral-lateral, ventral-medial, dorsal-medial, dorsal-lateral, and caudal-orbital. Each of these groups was defined by a unique set of inputs. This organization provides insight into the differential roles of each group of areas and suggests a gradient by which orbital and ventral-medial areas may be responsible for decision-making processes based on emotion and primary reinforcers, and lateral frontal areas are more involved in integrating affective and rational information into a common framework.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

UCL Discovery

PubMed Central

New resampling method for evaluating stability of clusters

Author: A Bhattacharjee
A Thalamuthu
B Efron
F Tschentscher
GC Tseng
H Pruscha
H Schneider
Irina M Gana Dresen
J Handl
J Quackenbush
JC Gower
JH Ward
Johannes Huesing
K Zhang
Karl-Heinz Joeckel
L Hubert
LM McShane
M Bittner
M Smolkin
Markus Neuhaeuser
MB Eisen
MK Kerr
PHA Sneath
RR Sokal
S Datta
S Datta
S Datta
S Dudoit
S Monti
T Margush
T Sørensen
Tanja Boes
WM Rand
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract. Background: Hierarchical clustering is a widely applied tool in the analysis of microarray gene expression data. The assessment of cluster stability is a major challenge in clustering procedures. Statistical methods are required to distinguish between real and random clusters. Several methods for assessing cluster stability have been published, including resampling methods such as the bootstrap. We propose a new resampling method based on continuous weights to assess the stability of clusters in hierarchical clustering. While in bootstrapping approximately one third of the original items are lost, continuous weights avoid zero elements and instead allow non-integer diagonal elements, which leads to retention of the full dimensionality of space, i.e. each variable of the original data set is represented in the resampling sample. Results: Comparison of continuous weights and bootstrapping using real datasets and simulation studies reveals the advantage of continuous weights, especially when the dataset has only a few observations, few differentially expressed genes, and a low fold change of the differentially expressed genes. Conclusion: We recommend the use of continuous weights in small as well as in large datasets, because according to our results they produce at least the same results as conventional bootstrapping and in some cases they surpass it.
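The "approximately one third" figure follows from sampling n items with replacement: each item is missed with probability (1 - 1/n)^n, which approaches 1/e (about 0.37) as n grows. The Python sketch below checks this numerically. The `continuous_weights` function is only an assumed stand-in (normalized exponential draws) for a strictly positive weighting scheme; the abstract does not specify the weight distribution the authors use.

```python
import math
import random

def expected_missing_fraction(n):
    """Probability that a given item is absent from a bootstrap sample of size n."""
    return (1 - 1 / n) ** n

def simulated_missing_fraction(n, replicates=200, seed=0):
    """Average fraction of the n items that never appear in a bootstrap sample."""
    rng = random.Random(seed)
    missing = 0
    for _ in range(replicates):
        drawn = {rng.randrange(n) for _ in range(n)}  # sample n indices with replacement
        missing += n - len(drawn)
    return missing / (replicates * n)

def continuous_weights(n, seed=0):
    """Assumed illustration only: positive, non-integer weights that sum to n."""
    rng = random.Random(seed)
    raw = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(raw)
    return [n * value / total for value in raw]

print(expected_missing_fraction(1000), math.exp(-1))
print(simulated_missing_fraction(100))
weights = continuous_weights(10)
print(min(weights) > 0, round(sum(weights), 6))
```

In a bootstrap sample the implied weights are integer counts, many of them zero; with continuous weights every item keeps a nonzero weight, which is the property the abstract highlights.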

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Constructing majority-rule supertrees

Author: A Purvis
AD Gordon
BR Baum
C Semple
CG Sibley
D Bryant
D Gusfield
D Gusfield
D Gusfield
D Pisani
David Fernández-Baca
DF Robinson
DG Brown
E Danna
EN Adams
F Delsuc
FR McMorris
G Sierksma
GB Nunn
J Dong
JA Cotton
JA Cotton
Jianrong Dong
JP Barthélemy
M Kennedy
M Wilkinson
M Wilkinson
MA Ragan
MA Steel
MdL Brooke
N Amenta
ND Pattengale
ORP Bininda-Emonds
P Goloboff
PA Goloboff
S Sridhar
T Margush
V Ranwez
W Day
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract. Background: Supertree methods combine the phylogenetic information from multiple partially-overlapping trees into a larger phylogenetic tree called a supertree. Several supertree construction methods have been proposed to date, but most of these are not designed with any specific properties in mind. Recently, Cotton and Wilkinson proposed extensions of the majority-rule consensus tree method to the supertree setting that inherit many of the appealing properties of the former. Results: We study a variant of one of Cotton and Wilkinson's methods, called majority-rule (+) supertrees. After proving that a key underlying problem for constructing majority-rule (+) supertrees is NP-hard, we develop a polynomial-size exact integer linear programming formulation of the problem. We then present a data reduction heuristic that identifies smaller subproblems that can be solved independently. While this technique is not guaranteed to produce optimal solutions, it can achieve substantial problem-size reduction. Finally, we report on a computational study of our approach on various real data sets, including the 121-taxon, 7-tree Seabirds data set of Kennedy and Page. Conclusions: The results indicate that our exact method is computationally feasible for moderately large inputs. For larger inputs, our data reduction heuristic makes it feasible to tackle problems that are well beyond the range of the basic integer programming approach. Comparisons between the results obtained by our heuristic and exact solutions indicate that the heuristic produces good answers. Our results also suggest that the majority-rule (+) approach, in both its basic form and with data reduction, yields biologically meaningful phylogenies.

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

NetMHCpan, a Method for Quantitative Predictions of Peptide Binding to Any HLA-A and -B Locus Protein of Known Sequence

Binding of peptides to Major Histocompatibility Complex (MHC) molecules is the single most selective step in the recognition of pathogens by the cellular immune system. The human MHC class I system (HLA-I) is extremely polymorphic. The number of registered HLA-I molecules has now surpassed 1500. Characterizing the specificity of each separately would be a major undertaking. Here, we have drawn on a large database of known peptide-HLA-I interactions to develop a bioinformatics method, which takes both peptide and HLA sequence information into account, and generates quantitative predictions of the affinity of any peptide-HLA-I interaction. Prospective experimental validation of peptides predicted to bind to previously untested HLA-I molecules, cross-validation, and retrospective prediction of known HIV immune epitopes and endogenous presented peptides, all successfully validate this method. We further demonstrate that the method can be applied to perform a clustering analysis of MHC specificities and suggest using this clustering to select particularly informative novel MHC molecules for future biochemical and functional analysis. Encompassing all HLA molecules, this high-throughput computational method lends itself to epitope searches that are not only genome- and pathogen-wide, but also HLA-wide. Thus, it offers a truly global analysis of immune responses supporting rational development of vaccines and immunotherapy. It also promises to provide new basic insights into HLA structure-function relationships. The method is available at http://www.cbs.dtu.dk/services/NetMHCpan

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Copenhagen University Research Information System

Online Research Database In Technology

Split-based computation of majority-rule supertrees

Author: A Kupczok
A Kupczok
AG Rodrigo
Anne Kupczok
B Holland
BR Baum
C Semple
C Semple
CA Meacham
CA Phillips
CJ Creevey
CJ Creevey
D Bryant
D Fitzpatrick
D Pisani
D Wu
DF Robinson
DH Huson
DL Swofford
E Bapteste
GU Yule
HA Ross
HT Lin
J Dong
J Dong
J Dong
JA Cotton
JL Thorley
M Kennedy
M Steel
M Wilkinson
M Wilkinson
M Wilkinson
MA Ragan
MJ Sanderson
MJ Sanderson
MS Bansal
MS Waterman
N Galtier
ORP Bininda-Emonds
P Puigbò
PA Goloboff
R Beck
RB Davis
RDM Page
T Margush
WF Doolittle
WJ Baker
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract. Background: Supertree methods combine overlapping input trees into a larger supertree. Here, I consider split-based supertree methods that first extract the split information of the input trees and subsequently combine this split information into a phylogeny. Well-known split-based supertree methods are matrix representation with parsimony and matrix representation with compatibility. Combining input trees on the same taxon set, as in the consensus setting, is a well-studied task, and it is thus desirable to generalize consensus methods to supertree methods. Results: Here, three variants of majority-rule (MR) supertrees that generalize majority-rule consensus trees are investigated. I provide simple formulas for computing the respective score for bifurcating input trees and supertrees. These score computations, together with a heuristic tree search minimizing the scores, were implemented in the Python program PluMiST (Plus- and Minus SuperTrees), available from http://www.cibiv.at/software/plumist. The different MR methods were tested by simulation and on real data sets. The search heuristic was successful in combining compatible input trees. When combining incompatible input trees, one variant in particular, MR(-) supertrees, performed well. Conclusions: The presented framework allows for an efficient score computation of three majority-rule supertree variants and input trees. I combined the score computation with a heuristic search over the supertree space. The implementation was tested by simulation and on real data sets and showed promising results. The MR(-) variant in particular seems to be a reasonable score for supertree reconstruction. Generalizing these computations to multifurcating trees is an open problem, which may be tackled using this framework.
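The first step of a split-based method, extracting the splits of an input tree, can be sketched in Python as follows. This is not PluMiST code: the nested-tuple tree representation and the function names `collect_clades` and `nontrivial_splits` are assumptions chosen for illustration, and the sketch covers only split extraction, not the MR score formulas or the tree search.

```python
def collect_clades(tree, found):
    """Return the leaf set of `tree`, adding every internal node's leaf set to `found`.

    Hypothetical representation: a leaf is a string, an internal node is a
    tuple of subtrees.
    """
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(collect_clades(child, found) for child in tree))
    found.add(leaves)
    return leaves

def nontrivial_splits(tree):
    """Return the splits (bipartitions of the taxa) with at least two taxa per side."""
    found = set()
    all_taxa = collect_clades(tree, found)
    splits = set()
    for clade in found:
        rest = all_taxa - clade
        if len(clade) > 1 and len(rest) > 1:
            # A split is an unordered pair of complementary taxon sets.
            splits.add(frozenset([clade, rest]))
    return splits

tree_1 = (("A", "B"), ("C", ("D", "E")))
tree_2 = (("A", "C"), ("B", ("D", "E")))
shared = nontrivial_splits(tree_1) & nontrivial_splits(tree_2)
for split in shared:
    print(" | ".join("".join(sorted(side)) for side in sorted(split, key=sorted)))
```

Storing each split as an unordered pair of complementary sets makes the result independent of where the tree is rooted, so splits from differently rooted input trees can be compared directly.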

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

IST Austria: PubRep (Institute of Science and Technology)

A Differentiation-Based Phylogeny of Cancer Subtypes

Histopathological classification of human tumors relies in part on the degree of differentiation of the tumor sample. To date, there is no objective, systematic method to categorize tumor subtypes by maturation. In this paper, we introduce a novel computational algorithm to rank tumor subtypes according to the dissimilarity of their gene expression from that of stem cells and fully differentiated tissue, and thereby construct a phylogenetic tree of cancer. We validate our methodology with expression data of leukemia, breast cancer and liposarcoma subtypes and then apply it to a broader group of sarcomas. The ranking of tumor subtypes that results from our methodology allows the identification of genes correlated with differentiation and may help to identify novel therapeutic targets. Our algorithm represents the first phylogeny-based tool to analyze the differentiation status of human tumors.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central